642 research outputs found

    Improved sparse approximation over quasi-incoherent dictionaries

    This paper discusses a new greedy algorithm for solving the sparse approximation problem over quasi-incoherent dictionaries. These dictionaries consist of waveforms that are uncorrelated "on average," and they provide a natural generalization of incoherent dictionaries. The algorithm provides strong guarantees on the quality of the approximations it produces, unlike most other methods for sparse approximation. Moreover, very efficient implementations are possible via approximate nearest-neighbor data structure

    One Table to Count Them All: Parallel Frequency Estimation on Single-Board Computers

    Sketches are probabilistic data structures that can provide approximate results within mathematically proven error bounds while using orders of magnitude less memory than traditional approaches. They are tailored for streaming data analysis even on architectures with limited memory, such as single-board computers that are widely exploited for IoT and edge computing. Since these devices offer multiple cores, with efficient parallel sketching schemes, they are able to manage high volumes of data streams. However, since their caches are relatively small, a careful parallelization is required. In this work, we focus on the frequency estimation problem and evaluate the performance of a high-end server, a 4-core Raspberry Pi and an 8-core Odroid. As a sketch, we employed the widely used Count-Min Sketch. To hash the stream in parallel and in a cache-friendly way, we applied a novel tabulation approach and rearranged the auxiliary tables into a single one. To parallelize the process efficiently, we modified the workflow and applied a form of buffering between hash computations and sketch updates. Today, many single-board computers have heterogeneous processors that combine slow and fast cores. To utilize all these cores to their full potential, we proposed a dynamic load-balancing mechanism which significantly increased the performance of frequency estimation. Comment: 12 pages, 4 figures, 3 algorithms, 1 table, submitted to EuroPar'1
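
    The Count-Min Sketch named in the abstract above can be illustrated with a minimal single-threaded sketch. It assumes salted BLAKE2b hashing from Python's standard library in place of the paper's tabulation hashing, and it omits the parallel buffering and load balancing entirely; the class name, `width`, and `depth` are hypothetical choices for illustration only.

```python
import hashlib


class CountMinSketch:
    """Illustrative Count-Min Sketch: depth rows of width counters each."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One hash function per row, derived by salting BLAKE2b with the row id.
        # (Assumption: this stands in for the tabulation hashing in the paper.)
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def update(self, item, count=1):
        # Increment one counter in every row.
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions only ever add to a counter, so the minimum over rows
        # is an upper bound on the true frequency that never undercounts.
        return min(
            self.table[row][self._index(item, row)] for row in range(self.depth)
        )


if __name__ == "__main__":
    cms = CountMinSketch()
    for token in ["a", "b", "a", "c", "a"]:
        cms.update(token)
    print(cms.estimate("a"))  # at least 3; exact unless every row collides
```

    Taking the minimum across rows is what gives the one-sided error: an estimate can exceed the true count but cannot fall below it.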

    Systematic mapping review on student’s performance analysis using big data predictive model

    This paper classifies the existing predictive models used for monitoring and improving students’ performance at schools and higher learning institutions. It analyses all the areas within the educational data mining methodology. Two databases were chosen for this study, and a systematic mapping study was performed. Because this research area is still in its infancy, only 114 articles published from 2012 to 2016 were identified. Of these, a total of 59 articles were reviewed and classified. There is increasing interest and research in the area of educational data mining, particularly in improving students’ performance with various predictive and prescriptive models. Most of the models are ultimately devised for pedagogical improvements. There is a severe scarcity of portable predictive models that fit into any educational environment. More research is needed on educational big data. Keywords: predictive analysis; student’s performance; big data; big data analytics; data mining; systematic mapping study

    Spatially embedded random networks

    Many real-world networks analyzed in modern network theory have a natural spatial element; e.g., the Internet, social networks, neural networks, etc. Yet, aside from a comparatively small number of somewhat specialized and domain-specific studies, the spatial element is mostly ignored and, in particular, its relation to network structure disregarded. In this paper we introduce a model framework to analyze the mediation of network structure by spatial embedding; specifically, we model connectivity as dependent on the distance between network nodes. Our spatially embedded random networks construction is not primarily intended as an accurate model of any specific class of real-world networks, but rather to gain intuition for the effects of spatial embedding on network structure; nevertheless we are able to demonstrate, in a quite general setting, some constraints of spatial embedding on connectivity such as the effects of spatial symmetry, conditions for scale free degree distributions and the existence of small-world spatial networks. We also derive some standard structural statistics for spatially embedded networks and illustrate the application of our model framework with concrete examples
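
    The idea of connectivity depending on the distance between nodes, as described in the abstract above, can be sketched as follows. This is only an illustration under stated assumptions: nodes are placed uniformly in the unit square and each pair is linked independently with probability exp(-d / scale). Neither the node placement nor this particular connection function is taken from the paper, and the function name and parameters are hypothetical.

```python
import math
import random


def spatial_random_graph(n, scale, seed=0):
    """Sample n points in the unit square and link pairs with a
    probability that decays with their Euclidean distance."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            # Assumed connection function: exponential decay in distance.
            if rng.random() < math.exp(-d / scale):
                edges.append((i, j))
    return points, edges


if __name__ == "__main__":
    points, edges = spatial_random_graph(n=50, scale=0.1)
    degrees = [0] * len(points)
    for i, j in edges:
        degrees[i] += 1
        degrees[j] += 1
    print("edges:", len(edges), "mean degree:", sum(degrees) / len(degrees))
```

    A smaller `scale` makes long links rarer, so the sampled graph becomes more local; swapping in a different decay function is the natural way to explore other regimes.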

    Stochastic Budget Optimization in Internet Advertising

    Internet advertising is a sophisticated game in which the many advertisers "play" to optimize their return on investment. There are many "targets" for the advertisements, and each "target" has a collection of games with a potentially different set of players involved. In this paper, we study the problem of how advertisers allocate their budget across these "targets". In particular, we focus on formulating their best response strategy as an optimization problem. Advertisers have a set of keywords ("targets") and some stochastic information about the future, namely a probability distribution over scenarios of cost vs click combinations. This summarizes the potential states of the world assuming that the strategies of other players are fixed. Then, the best response can be abstracted as stochastic budget optimization problems to figure out how to spread a given budget across these keywords to maximize the expected number of clicks. We present the first known non-trivial poly-logarithmic approximation for these problems as well as the first known hardness results of getting better than logarithmic approximation ratios in the various parameters involved. We also identify several special cases of these problems of practical interest, such as with a fixed number of scenarios or with polynomial-sized parameters related to cost, which are solvable either in polynomial time or with improved approximation ratios. Stochastic budget optimization with scenarios has sophisticated technical structure. Our approximation and hardness results come from relating these problems to a special type of (0/1, bipartite) quadratic programs inherent in them. Our research answers some open problems raised by the authors in (Stochastic Models for Budget Optimization in Search-Based Advertising, Algorithmica, 58 (4), 1022-1044, 2010). Comment: FINAL versio

    Learning Best Response Strategies for Agents in Ad Exchanges

    Ad exchanges are widely used in platforms for online display advertising. Autonomous agents operating in these exchanges must learn policies for interacting profitably with a diverse, continually changing, but unknown market. We consider this problem from the perspective of a publisher, strategically interacting with an advertiser through a posted price mechanism. The learning problem for this agent is made difficult by the fact that information is censored, i.e., the publisher knows if an impression is sold but no other quantitative information. We address this problem using the Harsanyi-Bellman Ad Hoc Coordination (HBA) algorithm, which conceptualises this interaction in terms of a Stochastic Bayesian Game and arrives at optimal actions by best responding with respect to probabilistic beliefs maintained over a candidate set of opponent behaviour profiles. We adapt and apply HBA to the censored information setting of ad exchanges. Also, addressing the case of stochastic opponents, we devise a strategy based on a Kaplan-Meier estimator for opponent modelling. We evaluate the proposed method using simulations wherein we show that HBA-KM achieves substantially better competitive ratio and lower variance of return than baselines, including a Q-learning agent and a UCB-based online learning agent, and comparable to the offline optimal algorithm
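
    The Kaplan-Meier estimator mentioned in the abstract above is a standard tool for censored observations. Below is a minimal sketch of the generic product-limit estimator, not the paper's HBA-KM strategy; how ad-exchange outcomes would be mapped to `times` and `observed` is left open here, and both names are hypothetical.

```python
def kaplan_meier(times, observed):
    """Product-limit estimate of the survival function.

    times:    value at which each subject left the study
    observed: True if the event was seen, False if the record is censored
    Returns a list of (t, S(t)) at each distinct event time.
    """
    data = sorted(zip(times, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        removed = 0
        # Group all records that share the same time value.
        while i < len(data) and data[i][0] == t:
            events += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        # Censored records leave the risk set without changing S(t).
        at_risk -= removed
    return curve


if __name__ == "__main__":
    for t, s in kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False]):
        print(t, round(s, 3))
```

    Censored records still count toward the risk set up to the point where they drop out, which is how the estimator uses partial information instead of discarding it.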

    The Tree Inclusion Problem: In Linear Space and Faster

    Given two rooted, ordered, and labeled trees $P$ and $T$, the tree inclusion problem is to determine if $P$ can be obtained from $T$ by deleting nodes in $T$. This problem has recently been recognized as an important query primitive in XML databases. Kilpeläinen and Mannila [SIAM J. Comput. 1995] presented the first polynomial time algorithm using quadratic time and space. Since then several improved results have been obtained for special cases when $P$ and $T$ have a small number of leaves or small depth. However, in the worst case these algorithms still use quadratic time and space. Let $n_S$, $l_S$, and $d_S$ denote the number of nodes, the number of leaves, and the maximum depth of a tree $S \in \{P, T\}$. In this paper we show that the tree inclusion problem can be solved in space $O(n_T)$ and time $O(\min(l_P n_T, l_P l_T \log \log n_T + n_T, \frac{n_P n_T}{\log n_T} + n_T \log n_T))$. This improves or matches the best known time complexities while using only linear space instead of quadratic. This is particularly important in practical applications, such as XML databases, where the space is likely to be a bottleneck. Comment: Minor updates from last tim

    Managing Risk of Bidding in Display Advertising

    In this paper, we deal with the uncertainty of bidding for display advertising. Similar to the financial market trading, real-time bidding (RTB) based display advertising employs an auction mechanism to automate the impression level media buying; and running a campaign is no different than an investment of acquiring new customers in return for obtaining additional converted sales. Thus, how to optimally bid on an ad impression to drive the profit and return-on-investment becomes essential. However, the large randomness of the user behaviors and the cost uncertainty caused by the auction competition may result in a significant risk from the campaign performance estimation. In this paper, we explicitly model the uncertainty of user click-through rate estimation and auction competition to capture the risk. We borrow an idea from finance and derive the value at risk for each ad display opportunity. Our formulation results in two risk-aware bidding strategies that penalize risky ad impressions and focus more on the ones with higher expected return and lower risk. The empirical study on real-world data demonstrates the effectiveness of our proposed risk-aware bidding strategies: yielding profit gains of 15.4% in offline experiments and up to 17.5% in an online A/B test on a commercial RTB platform over the widely applied bidding strategies

    On Exchange of Orbital Angular Momentum Between Twisted Photons and Atomic Electrons

    We obtain an expression for the matrix element for a twisted (Laguerre-Gaussian profile) photon scattering from a hydrogen atom. We consider photons incoming with an orbital angular momentum (OAM) of $\ell \hbar$, carried by a factor of $e^{i \ell \phi}$ not present in a plane-wave or pure Gaussian profile beam. The nature of the transfer of $+2\ell$ units of OAM from the photon to the azimuthal atomic quantum number of the atom is investigated. We obtain simple formulae for these OAM flip transitions for elastic forward scattering of twisted photons when the photon wavelength $\lambda$ is large compared with the atomic target size $a$, and small compared with the Rayleigh range $z_R$, which characterizes the collimation length of the twisted photon beam. Comment: 16 page

    Cross-Document Pattern Matching

    We study a new variant of the string matching problem called cross-document string matching, which is the problem of indexing a collection of documents to support an efficient search for a pattern in a selected document, where the pattern itself is a substring of another document. Several variants of this problem are considered, and efficient linear-space solutions are proposed with query time bounds that either do not depend at all on the pattern size or depend on it in a very limited way (doubly logarithmic). As a side result, we propose an improved solution to the weighted level ancestor problem